Co-evolutionary games on networks

We study agents on a network playing an iterated Prisoner's dilemma against their neighbors.
The resulting spatially extended co-evolutionary game exhibits stationary states which are Nash
equilibria. After perturbation of these equilibria, avalanches of mutations reestablish a stationary
state. Scale-free avalanche distributions are observed that are in accordance with calculations from
the Nash equilibria and a confined branching process. The transition from subcritical to critical
avalanche dynamics can be traced to a change in the degeneracy of the cooperative macrostate and
is observed for many variants of this game.

I. INTRODUCTION

Much recent research has been devoted to the statistical physics of complex
systems with game-theoretic interactions. One motivation is economic systems
composed of a large number of agents with simple local interactions that give
rise to complex global structures and dynamics. In particular, the problem of
stability and uniqueness of equilibria in economic systems has been readdressed
in the context of the aggregate behavior of individual agents [?]. Game theory
[?] and the theory of evolutionary games [?] provide a sufficient framework for
modeling individual interactions, whereas spatial structure has to be taken
into account to tackle co-evolutionary dynamics of real-world systems [?].

Here, we will consider random networks of agents which face a social dilemma
or, in physical terms, a frustrated interaction. Imagine a situation where each
player can take two actions, say cooperating or defecting. The optimal global
outcome would be achieved by all players cooperating. But an individual player
can gain much more by defecting and thereby exploiting the cooperators. Such a
situation is called a social dilemma or a frustrated interaction. The central
question is how social order is possible and how cooperative behavior can
emerge. Examples of such spatially extended dilemmas are biological networks,
where connected plants may or may not decide to share resources [?], the
analysis of internet congestion [?], models for economic communication [?],
and, of course, many sociological problems from conflict research to public
transportation [?].

A simple model system is given by the iterated Prisoner's dilemma (IPD) [13]
with co-evolutionary dynamics. The Prisoner's dilemma game is probably the
most prominent example of a basic model for the emergence of cooperative
behavior in social, economic, and biological systems. It provides a frustrated
two-particle interaction and has been extensively studied by physicists,
economists, biologists, and mathematicians.

A spatially extended Prisoner's dilemma was first proposed by Axelrod, who
concluded that territoriality strongly influences the evolution of cooperation
[?]. Extensive work on the spatial Prisoner's dilemma started in 1992 when
Nowak and May explored a cellular automaton based on this game on regular
lattices. They and others found complex spatiotemporal dynamics and emergence
of cooperation for strategy spaces confined to the strategies defecting and
cooperating [?]. For the Prisoner's dilemma on lattices and strategy spaces
confined to only cooperating and defecting (and Tit-For-Tat in [?]), methods
from theoretical physics, such as Monte Carlo simulations, percolation theory,
the theory of (nonequilibrium) phase transitions, and the concept of
self-organized criticality, were used to understand why cooperators or
defectors dominate or coexist in the system [?]. Lindgren and Nordahl
introduced players which sometimes act erroneously, allowing a complex
evolution of strategies in an unbounded strategy space [?]. Others found that
the limitation of a player's memory to the last encounter, which translates to
a bounded strategy space, does not provide a significant drawback for the
players [?]. Evolutionary games on networks, again with only two strategies,
have been studied to ask how spatial organization influences the transition
from defecting to cooperating [?] and how the players themselves may influence
the network topology [?]. In the following, we will study the Prisoner's
dilemma on a network with a larger but bounded strategy space and
co-evolutionary dynamics that lead to Nash equilibria as stationary states. It
will be shown both numerically and theoretically that payoff matrix, strategy
space, and topology are crucial in determining which equilibria occur and how
stable they are. In particular, critical avalanches of mutations are observed
for such games and will be explained in detail.

This paper is organized as follows. In Section II, the spatially extended
iterated Prisoner's dilemma is described as well as its co-evolutionary
dynamics. This is followed in Section III by numerical investigations of
avalanche dynamics showing three distinct regimes due to changes in payoff
matrix and topology. The observed Nash equilibria are described and explained
in Section IV, which enables us to understand the critical value of the control
parameter of the payoff matrix. A confined branching process is introduced in
Section V, clarifying the relaxation mechanism and the emergence of scale-free
behavior. Conclusions are drawn in Section VI.



II. A CO-EVOLUTIONARY SPATIALLY 
EXTENDED IPD 



History      Action
0            1
1            1
First move   1

TABLE I: Representation of the strategy of one agent with memory m = 1
(0: defection, 1: cooperation). The agent determined by the above strategy is
an unconditioned cooperator. It cooperates no matter whether its opponent has
cooperated or defected in the last move.



We start with a network with N players as nodes where each player plays an
iterated Prisoner's dilemma game against each of its neighbors. The Prisoner's
dilemma is a two-person game with two possible actions in each encounter. The
payoff matrix of the first player for the two strategies cooperating and
defecting (denoted by $\hat{s}_1$ and $\hat{s}_2$) is given by

$$ A = \begin{pmatrix} 3 & 0 \\ a_{21} & 1 \end{pmatrix} = (a_{ij})_{i,j = 1,2}, \tag{1} $$

with the entries $a_{ij} = \pi_1(\hat{s}_i, \hat{s}_j)$. Here
$\pi_1(\hat{s}_i, \hat{s}_j)$ is the payoff of player 1 if player 1 plays
strategy $\hat{s}_i$ and player 2 plays $\hat{s}_j$. The game is symmetric,
i.e. $\pi_1(\hat{s}_i, \hat{s}_j) = \pi_2(\hat{s}_j, \hat{s}_i)$. Therefore,
the corresponding payoff matrix of the second player is the transpose of the
first player's matrix. The Prisoner's dilemma game in general is defined by
the relations

$$ a_{12} < a_{22} < a_{11} < a_{21} \quad \text{and} \quad a_{12} + a_{21} < 2 a_{11}. \tag{2} $$

Hence, in one encounter of the Prisoner's dilemma, defecting is the strategy
that yields the best payoff regardless of the opponent's strategy. This is no
longer the case in the iterated game, where mutual cooperation is more
favorable than both players defecting or switching between defecting and
cooperating. This is why the system is frustrated: in each encounter,
defecting would maximize a player's payoff, but in the long run, when players
anticipate each other's actions, cooperation will in general do much better.
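As a small illustration of relations (2), the following sketch checks whether a given set of payoffs defines a Prisoner's dilemma. The function name and the numerical values ($a_{11} = 3$, $a_{12} = 0$, $a_{22} = 1$) are assumptions made for this example; they are chosen so that the admissible temptation lies in $3 < a_{21} < 6$, the range quoted later in the text.

```python
def is_prisoners_dilemma(a11, a12, a21, a22):
    """Return True if the payoffs satisfy both relations of Eq. (2)."""
    ordering = a12 < a22 < a11 < a21          # defecting dominates in a single encounter
    no_alternation_gain = a12 + a21 < 2 * a11  # alternating C/D pays less than mutual cooperation
    return ordering and no_alternation_gain


# Assumed illustrative payoffs; only the temptation a21 is varied.
for a21 in (3.5, 4.5, 6.0):
    print(a21, is_prisoners_dilemma(3, 0, a21, 1))
```

With these assumed values, the second relation is what bounds the temptation from above: at $a_{21} = 6$ alternating between cooperation and defection would pay as much as mutual cooperation.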



A. IPD with memory on a network

Let us further specify the strategy space and payoff function of the spatial
game. A strategy is viewed as a mapping of an agent's "knowledge" to an
"action". The "knowledge" of an agent is given by the previous moves the agent
can take into account to decide which action it will take next. We define the
memory length m of a player as the number of these previous moves and confine
the memory of the agents to m = 1, i.e. an agent remembers only its opponent's
last action. If one player encounters another player, it has to decide on its
first move without any information about the opponent. Accordingly, the
opening move is part of the strategy, too, which can be represented as a lookup
table or a binary string (Tab. I). The finite number of moves of one encounter
is not known by any agent. In the course of the game, one player has to play
against each of its neighbors on the network. Thereafter, its payoff is given
by the average payoff per move and neighbor.

The strategy space of a player i consists of up to 8 pure strategies
$S_i \subseteq \{0,1,2,3,4,5,6,7\}$ (cf. Tab. II for the definition of the
strategies). The pure-strategy space of the game is $S = \times_{i \in I} S_i$
with the set of players $I = \{1, \dots, n\}$. The (pure-strategy) payoff
function $\pi_i : S \to \mathbb{R}$ does not depend on the whole pure-strategy
profile $s = (s_1, \dots, s_n)$ but only on the strategies of the neighboring
nodes, $\pi_i = \pi_i(s_i, \mathrm{neigh}(i))$, and, of course, on the payoff
matrix of the Prisoner's dilemma game. Here, the set of the neighbors of a
node i is denoted neigh(i). With $\pi(s) = (\pi_1(s), \dots, \pi_n(s))$, the
game $(S, I, \pi)$ defined above is a finite normal-form game. Such games in
general have at least one Nash equilibrium [32]. Here, only pure strategies
will be considered, neglecting possible mixed-strategy equilibria. In this
setting, $s_D = (0, \dots, 0)$ and $s_{TFT} = (6, \dots, 6)$ are Nash
equilibria for any payoff matrix A obeying (2) and for a sufficiently high
number of moves, which can be easily verified. The former equilibrium consists
of players always defecting, whereas in the latter state each player repeats
its opponent's last move (Tit-For-Tat) with a cooperative opening move (cf.
Tab. II).


No.  Strategy                      Acronym  Bit string
0    always defect                 sD       000
1    suspicious anti-Tit-For-Tat   sATFT    001
2    suspicious Tit-For-Tat        sTFT     010
3    suspicious cooperate          sC       011
4    generous defect               gD       100
5    generous anti-Tit-For-Tat     gATFT    101
6    generous Tit-For-Tat          gTFT     110
7    always cooperate              gC       111

TABLE II: The strategy space of each player consists of up to 8 different pure
strategies, comprising all possible strategies for a memory of one move. The
first lowercase letter of the acronym describes the first move: "s" for
defecting (suspicious) and "g" for cooperating (generous). If the strategy is
coded as a bit string, the assigned numbers correspond to the respective
binary numbers.

FIG. 1: Probability distribution P(M) of avalanche size M for the subcritical
case. The avalanche size M is given by the number of mutation events necessary
to reestablish an equilibrium. With the temptation to defect in the range
$3 < a_{21} < 4$, only small avalanches are necessary to reestablish the
cooperative equilibrium. The open diamonds show data obtained for the
spatially extended Prisoner's dilemma averaged over 50 random networks
($N = 200$, $\langle k \rangle = 2$, $a_{21} = 3.5$). The mechanism of
relaxation is a branching process confined to the same topology (dashed curve,
$\alpha = 0.235$, cf. Sec. V).

B. Co-evolutionary dynamics

Let us now introduce mutations of a player's strategy. The lookup table
determining the strategy is viewed as a bit string of length $2^m + 1$, where
m is the memory length as defined above. This bit string will then be mutated
during the iteration of the game.

At the beginning, a random network with a given mean degree
$\langle k \rangle$ is generated [41]. The strategies are assigned randomly,
too. All agents initially play against each of their neighbors to update their
payoffs. Thereafter, the following steps are iterated: (i) One agent i is
chosen randomly and its strategy is mutated from $s_i$ to a strategy
$s_i' \in S_i$ picked at random. (ii) The mutated agent plays again against
its neighbors and its payoff is compared to the former result. The mutation is
accepted in the case of a higher payoff,
$\pi_i(s_1, s_2, \dots, s_i', \dots, s_n) > \pi_i(s_1, s_2, \dots, s_i, \dots, s_n)$,
and the payoffs of all neighbors are also updated. This corresponds to the
assumptions that accepting any mutation incurs some cost to the player and
that mutations occur on a time scale slower than the time scale of the game.
Iteration of this process leads to a stationary state with a fixed strategy
distribution. In the stationary state, no agent can improve its payoff by
changing its strategy while the other players' strategies remain unchanged.
This state corresponds to the game-theoretic Nash equilibrium [?]. Note that,
for its decisions, a player requires no more information than its own payoff.
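Steps (i) and (ii) and the stationarity condition can be sketched as follows. This is only an illustration under stated assumptions: a ring is used instead of a random network to keep the code short, the payoffs 3, 0 and 1, the 50 moves per encounter, the bit layout of the strategies (read off Tab. II) and all function names are choices made for this example.

```python
import random


def pair_payoff(s1, s2, a21, moves=50):
    """Average payoff per move of memory-one strategy s1 against s2 (both 0..7)."""
    payoff = {(1, 1): 3.0, (1, 0): 0.0, (0, 1): a21, (0, 0): 1.0}  # assumed values
    m1, m2 = (s1 >> 2) & 1, (s2 >> 2) & 1          # opening moves (bit 2)
    total = 0.0
    for _ in range(moves):
        total += payoff[(m1, m2)]
        m1, m2 = (s1 >> m2) & 1, (s2 >> m1) & 1    # reply to the opponent's last move
    return total / moves


def payoff(profile, i, a21):
    """Average payoff per move and neighbor of player i on a ring."""
    n = len(profile)
    left, right = profile[(i - 1) % n], profile[(i + 1) % n]
    me = profile[i]
    return 0.5 * (pair_payoff(me, left, a21) + pair_payoff(me, right, a21))


def mutation_step(profile, a21, rng):
    """Steps (i) and (ii): mutate one random agent, keep the change only if it pays more."""
    i = rng.randrange(len(profile))
    old_strategy, old_payoff = profile[i], payoff(profile, i, a21)
    profile[i] = rng.randrange(8)
    if payoff(profile, i, a21) > old_payoff:
        return True
    profile[i] = old_strategy          # reject: restore the former strategy
    return False


def is_stationary(profile, a21):
    """True if no single agent can strictly raise its payoff by switching strategy."""
    trial = list(profile)
    for i in range(len(trial)):
        base = payoff(trial, i, a21)
        for s in range(8):
            trial[i] = s
            if payoff(trial, i, a21) > base:
                return False
        trial[i] = profile[i]
    return True


rng = random.Random(0)
profile = [rng.randrange(8) for _ in range(10)]
for _ in range(2000):
    mutation_step(profile, a21=3.5, rng=rng)
print(profile, is_stationary(profile, a21=3.5))
```

Because a mutation is kept only on a strict payoff gain, the all-defect and all-gTFT profiles pass `is_stationary` for these assumed payoffs, while a ring of unconditioned cooperators does not.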



III. PERTURBATIONS AND AVALANCHES 

One essential property of evolutionary games is given 
by the equilibria or the evolutionarily stable states. All 
stationary states of the game are Nash equilibria. An 
interesting question is the stability of these equilibria 
against perturbations. In the following, we will study 
the dynamics of avalanches of mutation events follow- 
ing a perturbation of the Nash equilibrium. After the 
system has reached a stationary state, a new strategy 
is assigned to a random player. The insertion of a sub- 
optimal strategy offers new opportunities for mutations 
to the perturbed player itself and to its neighbors. Since 
players are updated randomly, a perturbation leads to an 
avalanche of mutations until a stationary state is reached 
again. One quantity of interest is the avalanche size M 
given by the number of mutations necessary to reestablish 
the equilibrium and its dependence on the payoff matrix 
A. We will first discuss the numerical results for the case 
of players on a random network. The second part of this 
section deals with a Prisoner's dilemma on a ring, which 
will be the starting point for the theoretical treatment in 
the next two sections. 
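The perturbation protocol described above can be sketched in a few lines. The sketch makes several assumptions that are not taken from the text: a ring topology, payoffs 3, 0 and 1, 50 moves per encounter, the bit layout of the strategies as read off Tab. II, a cap on the number of mutations, and the convention that the perturbation itself is not counted in M. Instead of proposing random mutations and rejecting the non-improving ones, it draws directly from the list of improving mutations, which yields the same statistics of accepted mutations when proposals are uniform.

```python
import random


def pair_payoff(s1, s2, a21, moves=50):
    """Average payoff per move of memory-one strategy s1 against s2 (both 0..7)."""
    payoff = {(1, 1): 3.0, (1, 0): 0.0, (0, 1): a21, (0, 0): 1.0}  # assumed values
    m1, m2 = (s1 >> 2) & 1, (s2 >> 2) & 1
    total = 0.0
    for _ in range(moves):
        total += payoff[(m1, m2)]
        m1, m2 = (s1 >> m2) & 1, (s2 >> m1) & 1
    return total / moves


def build_table(a21):
    """Pairwise payoff table for all 8 x 8 strategy combinations."""
    return [[pair_payoff(s, t, a21) for t in range(8)] for s in range(8)]


def improving_moves(profile, table):
    """All (player, strategy) pairs that would strictly raise that player's payoff on a ring."""
    n = len(profile)
    moves = []
    for i in range(n):
        left, right = profile[i - 1], profile[(i + 1) % n]
        base = table[profile[i]][left] + table[profile[i]][right]
        for s in range(8):
            if table[s][left] + table[s][right] > base:
                moves.append((i, s))
    return moves


def relax(profile, table, rng, max_mutations=2000):
    """Accept random improving mutations until none is left; return how many were accepted."""
    accepted = 0
    while accepted < max_mutations:
        moves = improving_moves(profile, table)
        if not moves:
            break
        i, s = rng.choice(moves)
        profile[i] = s
        accepted += 1
    return accepted


def avalanche_size(profile, table, rng):
    """Perturb one random player of a stationary profile and count the mutations that follow."""
    profile[rng.randrange(len(profile))] = rng.randrange(8)
    return relax(profile, table, rng)


rng = random.Random(1)
table = build_table(a21=3.5)
profile = [rng.randrange(8) for _ in range(30)]
relax(profile, table, rng)
sizes = [avalanche_size(profile, table, rng) for _ in range(20)]
print(sizes)
```

Collecting many such sizes for different values of the temptation $a_{21}$ gives a histogram that can be compared with P(M).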

In the case of sparsely connected random networks, one observes three distinct
regimes of the avalanche dynamics, with the temptation to defect $a_{21}$ as
control parameter. For small temptations, $3 < a_{21} < 4$, a subcritical
regime occurs where large avalanches are suppressed exponentially (Fig. 1).
For $a_{21} > 4$, critical behavior occurs with avalanche sizes distributed
according to a power law $P(M) \propto M^{-\gamma}$ with the scaling exponent
$\gamma = 1.39 \pm 0.10$ (Fig. 2) and a cutoff scaling linearly with system
size N. This critical regime is followed by a supercritical regime for
$4.70 < a_{21} < 6$ with an enhanced probability of very large events (Fig. 3).

Thus, above a critical value of the temptation to defect, $a_{21}^c = 4$,
small perturbations of the system lead to long-lasting avalanches that affect
all players of the whole system, with a mean avalanche size that diverges in
the thermodynamic limit. The transition from a regime with small avalanches to
a critical one with system-wide avalanches is robust under moderate changes of
the strategy space S and the mean degree $\langle k \rangle$. It also occurs
for smaller strategy spaces $S_i$ with $\mathrm{card}(S_i) > 5$ and
$\{0, 6, 7\} \subset S_i$. The qualitative behavior remains even for
indefinitely iterated games or with a very different payoff matrix [?], which
is sometimes used in the context of the Prisoner's dilemma,

A = [matrix entries lost in the source]    (3)

With $S_i = \{0, 7\}$ and the matrix A of Eq. (3), but with quite different
evolutionary dynamics, Lim, Chen, and Jayaprakash also found critical
avalanches on a two-dimensional square lattice [?].

The different regimes of relaxation dynamics can be explained by a closer look
at the structure of the Nash equilibria involved (Sec. IV) as well as at the
relaxation mechanism, which is given by a confined branching process (Sec. V).
Before we start these considerations in the next two sections, we briefly
discuss the case of a co-evolutionary Prisoner's dilemma on a ring, i.e. on
the regular network where each player's degree is exactly $k_i = 2$. Although
this is not a very reasonable model for real spatially extended systems, it
will give some useful insights and will allow us to calculate some properties
of the spatially extended game analytically.

As for random networks, subcritical, critical, and supercritical regimes
occur, with subcritical avalanche distributions in the range $3 < a_{21} < 4$
and supercritical behavior for $4.12 < a_{21} < 6$. However, this time the
critical avalanche distribution has a scaling exponent of
$\gamma = 1.04 \pm 0.05$, which differs significantly from the exponent
obtained for random networks (Fig. 4).

FIG. 2: Probability distribution P(M) of avalanche size M (number of mutation
events) for the critical case on a random network. The subcritical regime is
followed by critical behavior for $4 < a_{21} < 4.70$. The distribution (open
diamonds, average over 50 networks with $N = 200$, $\langle k \rangle = 2$,
and $a_{21} = 4.5$) can be well approximated by a power law
$P(M) \propto M^{-\gamma}$ with $\gamma = 1.39 \pm 0.10$ over three orders of
magnitude (solid line). The scale-free behavior can be explained by a confined
branching process (dashed curve, $\alpha = 0.315$, cf. Sec. V).



FIG. 3: Probability distribution P(M) of avalanche size M (number of mutation
events) in the supercritical regime on a random network. For high values of
the temptation to defect, $4.7 < a_{21} < 6$, a supercritical distribution of
the avalanche size is observed (open diamonds, average over 50 networks,
$N = 200$, $\langle k \rangle = 2$, $a_{21} = 4.7$). Again, a confined
branching process appears to match the relaxation dynamics well (dashed curve,
$\alpha = 0.390$, cf. Sec. V).

FIG. 4: Probability distribution P(M) of avalanche size M (number of mutation
events) on a ring in the critical regime. In the range
$4.01 < a_{21} < 4.11$, critical behavior in terms of the avalanche
distribution is also observed for the Prisoner's dilemma on a ring
($N = 200$, $a_{21} = 4.1$). The scale-free distribution
$P(M) \propto M^{-\gamma}$ has an exponent of $\gamma = 1.04 \pm 0.05$, which
is significantly smaller than the scaling exponent observed for the same game
on a random network with identical mean degree $\langle k \rangle$. The
experimental data agree very well with the behavior of a branching process
confined to a ring (dashed curve, $\alpha = 0.512$, cf. Sec. V).

IV. NASH EQUILIBRIA AND THEIR
DEPENDENCE ON THE PAYOFF MATRIX



The set of possible stationary states of the co-evolutionary Prisoner's
dilemma is the set of Nash equilibria which, as has been shown above, contains
for all $a_{21} \in (3, 6)$ the defective equilibrium $s_D = (0, \dots, 0)$
and the Tit-For-Tat equilibrium $s_{TFT} = (6, \dots, 6)$. One can also
consider the macrostates of the system corresponding to the aggregated
behavior of the agents. Identifying cooperative moves with "spin up" and
defecting moves with "spin down", the macroscopic behavior at one instant of
time is the magnetization of the system. Thus, the strategy profile $s_D$ of
all agents playing strategy 0 corresponds to the minimal magnetization -1,
whereas the Tit-For-Tat equilibrium $s_{TFT}$ leads to the maximal
magnetization +1.



A. Equilibria on rings and random networks 

Starting with the experimental findings for a ring topology, one observes
three regimes in terms of adopted equilibria which are exactly matched by the
three different regimes of avalanche dynamics. In the subcritical regime, the
stationary states are a mixture of the strategies 6 and 7, i.e. generous
Tit-For-Tat players and unconditioned cooperators, respectively (Tab. II).
Only generous Tit-For-Tat prevails in the critical regime. With the onset of
supercritical behavior, the defective equilibrium $s_D$ turns up. Its fraction
of the equilibria reached by the game grows very fast with further increasing
temptation to defect. The first transition can be explained by a simple
calculation of the Nash equilibria. With $a_{21} < 4$, the cooperative
macrostate is degenerate in many Nash equilibria since unconditioned
cooperators are stabilized by neighbors with the strategy gTFT. Consider a
player i with its neighbors playing $s_{i-1} = 7$ (i.e. generous cooperate or
gC) and $s_{i+1} = 6$ (gTFT). Then player i has to find a strategy that is a
compromise between exploiting the cooperator at i - 1 and maintaining
cooperation with its other neighbor, the smarter Tit-For-Tat player at i + 1.
However, for $a_{21} < 4$, there is no such strategy yielding a better payoff
than gTFT or even gC. This stabilization of the credulously cooperating agents
gives rise to a degeneracy of the cooperative macrostate in many different
strategy profiles that are Nash equilibria, with their number diverging faster
than $2^{2N/3}$ with the size of the ring. On the other hand, if $a_{21} > 4$
there always exists such a compromise strategy and the degeneracy vanishes.
That means that below the critical value $a_{21}^c = 4$ the macrostate with
magnetization +1 is strongly degenerate in many Nash equilibria, whereas above
$a_{21}^c$ there is only one Nash equilibrium with maximal magnetization left
regardless of system size ($s_{TFT}$). The other macrostate with minimal
magnetization is never degenerate since $s_D$ is the only Nash equilibrium
that leads to such a defective macrostate. In the case of regular lattices
with different numbers of next neighbors k, the critical value $a_{21}^c$ is
given by

$$ a_{21}^c(k) = 4\,\frac{2k + 1}{k + 3}. \tag{4} $$

For example, in the case of a two-dimensional lattice with periodic boundary
conditions and a von Neumann neighborhood, we find subcritical, critical, and
supercritical behavior, with $a_{21}^c \approx 5.14$ and a critical exponent
of $\gamma = 1.3 \pm 0.2$. Of course, for every payoff matrix A satisfying (2)
there exists a finite value for the number of next neighbors k above which the
cooperative macrostate is always degenerate. However, $a_{11}$ and $a_{21}$
can be adjusted to increase this border to arbitrarily large values.
Nonetheless, there is a reason why, for every payoff matrix, true critical
behavior can only take place in sparsely connected networks, which will be
discussed in the next section. Why are only cooperative equilibria observed in
both the subcritical and the critical range of $a_{21}$? Looking closer at the
way to the equilibrium, there are $(\sigma^2(\sigma + 1)(\sigma - 1))/2$
transition probabilities for a player i changing its strategy, with
$\sigma := \mathrm{card}(S_i)$. When increasing the temptation to defect
$a_{21}$, some of these rules change from 0 to a finite value and the
respective inverse rule vice versa. At the transition to the supercritical
regime, where the defective equilibrium is reached for the first time, exactly
those rules change which govern the stability of the border between
cooperative and defective domains. Below that threshold, the cooperative
domains grow and above it defecting strategies can spread.
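The text does not name the compromise strategy. One reading that reproduces both $a_{21}^c = 4$ on a ring and $a_{21}^c \approx 5.14$ for four neighbors is that the player switches to gATFT (strategy 5). Under the assumed payoffs 3, 0 and 1 and in the long-run limit, gATFT earns $a_{21}$ per move against gC and cycles through the payoffs $a_{21}, 1, 0, 3$ against gTFT, while gTFT earns 3 against both. The sketch below encodes this inference for a player with one gC neighbor and k - 1 gTFT neighbors; the function names are hypothetical.

```python
from fractions import Fraction


def gatft_advantage(a21, k):
    """Long-run payoff of gATFT minus that of gTFT, summed over k neighbors.

    Assumes one always-cooperating (gC) neighbor and k - 1 gTFT neighbors,
    and the assumed payoffs a11 = 3, a12 = 0, a22 = 1.
    """
    against_gc = Fraction(a21)                 # defect against a cooperator every move
    against_gtft = (Fraction(a21) + 4) / 4     # period-4 cycle with payoffs a21, 1, 0, 3
    return against_gc + (k - 1) * against_gtft - 3 * k


def critical_temptation(k):
    """Temptation at which the advantage above changes sign, i.e. Eq. (4)."""
    return Fraction(4 * (2 * k + 1), k + 3)


for k in (2, 4):
    print(k, critical_temptation(k), float(critical_temptation(k)))
```

For k = 2 this gives 4 and for k = 4 it gives 36/7, in line with the values quoted in the text.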

The situation is slightly different on a random network. Since there are
always some nodes with a degree higher than the mean degree
$\langle k \rangle$, a small degeneracy of the cooperative macrostate can
exist even for $a_{21} > a_{21}^c$. Moreover, disconnected components may be
in different equilibria at the same time. The highly connected nodes stabilize
the cooperative equilibrium so that even for $a_{21}$ approaching 6
cooperating strategies predominate. Yet this degeneracy does not compensate
for the loss of cooperative equilibrium profiles for temptations larger than
$a_{21}^c$, which is the reason for the transition from subcritical to
critical behavior. The supercritical phase is again caused by the change of
transition rules leading to growing defective domains, with their growth
hindered by highly connected cooperative nodes.



B. Nash equilibria and ESS 

As we are dealing with an evolutionary game, the question arises whether any
of the Nash equilibria is also evolutionarily stable. A strategy profile is
called an evolutionarily stable state (ESS) if it is stable against the
insertion of a small but finite fraction of mutants playing a different
strategy [?, 35]. Therefore, an ESS is a strict Nash equilibrium or a
non-strict Nash equilibrium with the additional condition that other best
replies play worse against themselves than the ESS strategy against them. Note
that this concept is formulated for two-person games where two players
encounter each other by chance. In this sense, neither of the Nash equilibria
$s_D = (0, \dots, 0)$ and $s_{TFT} = (6, \dots, 6)$ is an ESS for
$S_i = \{0, 1, 2, 3, 4, 5, 6, 7\}$ since other best replies score equally well
as the equilibrium strategy (strategies 2 and 7, respectively). With respect
to strategy spaces reduced by 2 or 7, the respective (now strict) Nash
equilibrium becomes an ESS. But does this concept of stability apply to
spatially extended games? Many approaches to evolutionary stability lead to
the equivalence of ESS and strict Nash equilibria. So one may conjecture that
in a game with $S_i = \{0, 1, 5, 6, 7\}$ the profile $s_D$ should be an ESS
since it is a strict Nash equilibrium. Yet, as the experiments show, a small
perturbation can cause the system to change from the strict Nash equilibrium
$s_D$ to the non-strict cooperative equilibrium $s_{TFT}$. Thus, when applying
the notion of evolutionary stability to spatially extended systems, one has to
keep in mind that things may be different here since it is the local
surrounding that decides whether an invader will overthrow the incumbent
strategy or will fail instead.



V. BRANCHING PROCESSES AS A MODEL OF 
THE RELAXATION PROCESS 

Having understood the structure of the Nash equilibria and their connection to
the transition between different regimes of avalanche dynamics, the question
remains how to explain the distinct form of the probability distributions P(M)
and in particular the scaling exponents of the critical regimes. In fact, the
relaxation process can be described by a type of branching process which
predicts very well the scaling exponents for the different topologies as well
as the subcritical and supercritical avalanche distributions.



A. The Galton-Watson process

The starting point is a simple branching process, also known as a
Galton-Watson process, which will be reformulated in terms of mutated agents
giving rise to future mutations of other players. Let $Z_n$ be the number of
mutated players in the n-th generation. Each mutated player can cause other
mutations in the next generation, with the probability $p_m$ that its mutation
is succeeded by m mutations in the next generation. The stochastic process
$(Z_n)_{n \in \mathbb{N}_0}$ is called a branching process of the
Galton-Watson type. Note that the number of generations constitutes a time
scale completely different from the time scale of the game, where at each
instant of time one player is chosen to mutate its strategy. With the initial
condition $Z_0 = 1$ and the total progeny $Z := \sum_{i=0}^{\infty} Z_i$, the
quantity of interest is the distribution $P(Z = r)$ of total progeny or, in
other words, the avalanche size. So far there is no spatial constraint on this
process, i.e. $Z_i$ is not bounded by the system size, and mutations
independently give birth to new mutations. The probability $p_m$ that a
mutation of a player with k neighbors will be followed by m mutations in the
next generation is given by

$$ p_m = \binom{k+1}{m} \alpha^m (1 - \alpha)^{k+1-m}, \tag{5} $$

being the simplest choice if a player's mutation can only affect its
neighborhood including itself. Using generating functions [?], we calculate
$P(Z = r)$ for this special Galton-Watson process with the same $k_i = k$ for
all players, to leading order for large r, to be

$$ P(Z = r) = \frac{1 - \alpha}{\alpha k} \sqrt{\frac{k+1}{2\pi k}} \left[ \frac{\alpha (1 - \alpha)^k (k+1)^{k+1}}{k^k} \right]^r r^{-3/2}. \tag{6} $$

It is useful to introduce the mean number of a mutation's "children",
$\bar{m} = \alpha (k + 1) = E Z_1$, and approximate (6) for $\bar{m} < 1$ by

$$ P(Z = r) = C\, r^{-3/2} e^{-r/r_0}, \tag{7} $$

with

$$ r_0^{-1} = \frac{k+1}{2k} (1 - \bar{m})^2 \tag{8} $$

and a constant

$$ C = \frac{k + 1 - \bar{m}}{\bar{m} k} \sqrt{\frac{k+1}{2\pi k}}. \tag{9} $$


FIG. 5: The branching process on a ring. Each node (circle) is occupied by a
player, with the circle filled if the player's strategy has mutated in the
respective generation. (A) The initially mutated player causes its neighbors
and itself to mutate in the next generation with probability $\alpha$
(arrows). No more players are affected since only its neighborhood and the
player itself can experience a different payoff due to the mutation. (B) The
progenies of mutated players in general are not independent of each other. Two
mutations can influence the same site, making the analytical treatment
difficult.

If the expected number of descendants approaches one, i.e. $\bar{m} \to 1$,
the exponential cutoff diverges with $(1 - \bar{m})^{-2}$. The process becomes
critical with a scale-free avalanche distribution
$P(Z = r) \propto r^{-3/2}$. If $\bar{m} > 1$, there is a finite probability
that Z does not converge at all [?]. The branching process described above,
which has no spatial constraints, is characterized by subcritical, critical,
and supercritical regimes of its avalanche dynamics. Although this is very
similar to the IPD on a random network, in the case of a ring it yields a
wrong scaling exponent of $\gamma = 3/2$. Such behavior could be obtained
equally well from a random walk of the number of mutated sites with drift
towards a reflecting boundary. In the following, we will show that it is the
restriction of the branching process to the network topology that completely
explains the dynamics and leads to the correct scaling exponents.
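The unconfined process is easy to explore numerically. The sketch below simulates the total progeny for binomially distributed offspring (each mutation triggers each of k + 1 candidate mutations with probability $\alpha$) and compares it with the exact total-progeny law of a Galton-Watson process, $P(Z = r) = \frac{1}{r} P(S_r = r - 1)$, where $S_r$ is the sum of r independent offspring numbers. Using this exact identity instead of the asymptotic formula is a choice made here; the function names and the cap on the avalanche size are assumptions of the example.

```python
import math
import random


def total_progeny_pmf(r, k, alpha):
    """Exact P(Z = r) for a Galton-Watson process with Binomial(k + 1, alpha) offspring."""
    return (math.comb((k + 1) * r, r - 1)
            * alpha ** (r - 1) * (1 - alpha) ** (k * r + 1) / r)


def simulate_total_progeny(k, alpha, rng, cap=10**6):
    """Simulate one avalanche and return its total size (capped for supercritical runs)."""
    active, total = 1, 1                       # Z_0 = 1
    while active and total < cap:
        # Each active mutation independently triggers up to k + 1 new ones.
        children = sum(1 for _ in range(active * (k + 1)) if rng.random() < alpha)
        active = children
        total += children
    return total


rng = random.Random(0)
k, alpha = 2, 0.25                             # mean offspring number alpha * (k + 1) = 0.75
sizes = [simulate_total_progeny(k, alpha, rng) for _ in range(10000)]
for r in (1, 2, 3):
    print(r, sum(s == r for s in sizes) / len(sizes), total_progeny_pmf(r, k, alpha))
```

Moving $\alpha$ towards $1/(k+1)$ lengthens the tail of the simulated distribution, as expected from the diverging cutoff.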



B. Confined branching processes 

The confinement of the branching process leads to two effects. First, $Z_n$
will be bounded by the system size N; second, the mutation events caused by
mutated players are no longer stochastically independent. We will call a
branching process confined or restricted to a network (i) if there exists a
one-to-one mapping of players and nodes and (ii) if a mutated player can only
give birth to mutations in its neighborhood including itself (Fig. 5A). This
corresponds to the fact that if a player changes its strategy, only the
payoffs of its neighbors and of the player itself will be affected. We assume
that each neighbor and the mutated site itself has the same probability
$\alpha$ of mutation in the next generation. With the random variable
$X_\nu^{(n)}$, being 1 if the player at node $\nu$ is mutated in generation n
and 0 otherwise, the confined process $(Z'_n)_{n \in \mathbb{N}_0}$ is defined
by

$$ Z'_n = \sum_{\nu=1}^{N} X_\nu^{(n)}. \tag{10} $$

The probability of a mutation at site $\nu$ in generation n is

$$ P(X_\nu^{(n)} = 1) = 1 - (1 - \alpha)^{\lambda}, \tag{11} $$

with

$$ \lambda = \sum_{\mu \in \mathrm{neigh}(\nu) \cup \{\nu\}} X_\mu^{(n-1)}. \tag{12} $$

Dynamics        $\alpha_{mf1}$   $\alpha_{mf2}$   $\alpha$
subcritical     0.290            0.234            0.235
critical        0.340            0.306            0.315
supercritical   0.320            0.308            0.390

TABLE III: The branching parameter $\alpha$, determined with mean-field
approaches. The parameter $\alpha$, obtained for the experimental
distributions of the different regimes on a random network (Figs. 1, 2, 3), is
compared to mean-field results using a random neighborhood and absorbing
stable equilibrium states ($\alpha_{mf1}$) or averaged over realizations of
the game ($\alpha_{mf2}$).

The confined branching process (Z' n ) can now be used 
to calculate the avalanche distributions of the spatially 
extended Prisoner's dilemma numerically. Applying it to 
random networks, both the subcritical and supercritical 
distributions are matched well (dashed curves in Fig. |l| 
and Fig. |^). The distribution of the confined branching 
process agrees even better with the experimental data in 
the critical regime (dashed curves in Fig. ||). The same 
is true for the Prisoner's dilemma on a ring (Fig. |). In 
both critical cases, the branching process shows the cor- 
rect finite-size scaling of the cutoff which is proportional 
to the system size. Note that the critical regimes of the
game on the random network and on the ring have different
scaling exponents owing to the network topology, and both are
correctly reproduced by the confined branching process.
The critical exponents depend only on the topology and
not on the parameter $\alpha$ of
the process. Therefore, the relaxation mechanism of the
spatially extended co-evolutionary Prisoner's dilemma is 
a confined branching process. 
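
As a rough illustration of how Eqs. (10)-(12) can be iterated numerically, the following minimal Python sketch simulates avalanches of a confined branching process on a random network and on a ring. It is a sketch under stated assumptions, not the code used for the figures: the graph constructors, all function names, the choice of a single uniformly random seed site, the cap `max_generations`, and the definition of the avalanche size as the total number of mutation events summed over generations are assumptions made for this example. The branching parameter is called `alpha` in the code.

```python
import random


def random_graph(n, mean_degree, rng):
    """Undirected random graph (assumed Erdos-Renyi type) as a list of neighbor sets."""
    p = mean_degree / (n - 1)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def ring_graph(n):
    """Ring in which every node is linked to its two nearest neighbors."""
    return [{(i - 1) % n, (i + 1) % n} for i in range(n)]


def step(adj, mutated, alpha, rng):
    """One generation of the confined process, following Eqs. (11) and (12)."""
    # lam[nu] counts the mutated sites in neigh(nu) united with {nu}.
    # For an undirected graph this equals the number of mutated sites mu
    # whose closed neighborhood contains nu, which is what the loop tallies.
    lam = {}
    for mu in mutated:
        for nu in adj[mu] | {mu}:
            lam[nu] = lam.get(nu, 0) + 1
    # Site nu mutates with probability 1 - (1 - alpha)**lam[nu].
    return {nu for nu, count in lam.items()
            if rng.random() < 1.0 - (1.0 - alpha) ** count}


def avalanche_size(adj, alpha, rng, max_generations=10_000):
    """Sum of Z'_n over generations, started from one random mutated site."""
    mutated = {rng.randrange(len(adj))}
    size = 0
    for _ in range(max_generations):
        if not mutated:
            break
        size += len(mutated)  # Z'_n of Eq. (10) for the current generation
        mutated = step(adj, mutated, alpha, rng)
    return size
```

A histogram of `avalanche_size(adj, alpha, rng)` over many runs, for instance with `adj = random_graph(1000, 4.0, random.Random(1))`, gives an estimate of the avalanche distribution for a chosen `alpha`. Because a site can be counted at most once per generation however many mutated neighbors it has, the overlap of neighborhoods is built into `step`.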

Mean-field approaches can be applied successfully to
explain the parameter $\alpha$ of the confined branching
process in the subcritical and critical regimes (Tab. III). To
calculate a mean-field approximation $\alpha_{\mathrm{mf1}}$ of the
branching parameter, the transition probabilities of a mutated
agent's neighbors are determined using a random
neighborhood for both the player and its neighbors. The
structure of the game is taken into account only by assuming
that the stable strategies are absorbing states. A second
approach is to average the transition probabilities over
game realizations numerically, yielding $\alpha_{\mathrm{mf2}}$. Both
values, $\alpha_{\mathrm{mf1}}$ and $\alpha_{\mathrm{mf2}}$, agree well with the
parameter $\alpha$ obtained from the avalanche distributions of the
subcritical and critical regimes. This is consistent with the explanation
that this transition occurs solely because of the change 
in the degeneracy of the cooperative macrostate at the 
critical value a\ Y . The supercritical case is not matched 
by the mean-field approaches which may be due to the 
fact that here the dynamics are governed by local effects, 
i.e. the competitive growth at the boundaries between 
cooperative and defective domains. The dynamics on a 
ring topology can be explained by a similar mean-field 
approach, too, if one assumes that the effective maxi- 
mal number of a player's descendants is approximately 
two and not three. This reduction of potential progeny 
is caused by the strong overlap of the neighborhoods in 
this regular lattice (Fig. || B). 

Although the definition of the confined Galton-Watson
process is quite intuitive and simple, its analytical treatment
is not. The reason is that mutation events have become
dependent on each other. Two mutations can affect
the same site in the next generation (Fig. [| B), leading to
dependent recursive equations (11), (12) for the mutation
probability. With the simplification that the $Z'_n$ mutated
sites of generation $n$ are randomly distributed over the
network, one can shed some light on the critical behavior
of the confined branching process. Under this assumption,
the conditional expectation of the number of mutated players is

$$ E(Z'_{n+1} \mid Z'_n) = \bar{m} Z'_n (1 + \xi) \tag{13} $$

with 



s 

N 



1-1- 



7 ro AT 



fe+i 



' N 



(14) 



If $\xi \ll 1$ and $\bar{m} \approx 1$, the confined process is approximately
a martingale for all values of $Z'_n$ and should show critical
behavior. For $\bar{m} \ll 1$ the avalanche dynamics are subcritical
as the process becomes a supermartingale. With
$\bar{m} \gg 1$ obviously resulting in supercritical dynamics, the
remaining case of interest is $\bar{m} \approx 1$. In the event of highly
connected networks with $\langle k \rangle \gg 1$, the correction $\xi$ is of
the order $-1$, suppressing large avalanches. Thus critical
avalanche dynamics are expected only for sparsely connected
networks, since too strong dependencies of mutation
events lead to either subcritical or supercritical distributions
of avalanche sizes.
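
The effect of overlapping neighborhoods on the conditional expectation can also be checked numerically. The sketch below does not attempt to reproduce the closed form of Eq. (14). Instead, for a given set of mutated sites, it evaluates the exact expectation implied by Eqs. (11) and (12) and compares it with the number of offspring expected if every mutated site reproduced independently of the others. The function names, and the use of the independent-offspring count as the reference value, are assumptions made for this illustration. By Bernoulli's inequality the resulting relative correction is never positive.

```python
def expected_next_generation(adj, mutated, alpha):
    """Exact E(Z'_{n+1} | set of mutated sites) from Eqs. (11) and (12)."""
    # lam[nu]: number of mutated sites in the closed neighborhood of nu
    lam = {}
    for mu in mutated:
        for nu in adj[mu] | {mu}:
            lam[nu] = lam.get(nu, 0) + 1
    return sum(1.0 - (1.0 - alpha) ** count for count in lam.values())


def independent_offspring(adj, mutated, alpha):
    """Expected offspring if mutated sites did not share any target sites."""
    return alpha * sum(len(adj[mu]) + 1 for mu in mutated)


def overlap_correction(adj, mutated, alpha):
    """Relative deviation of the confined expectation from the independent one."""
    confined = expected_next_generation(adj, mutated, alpha)
    return confined / independent_offspring(adj, mutated, alpha) - 1.0
```

For a single mutated site the correction vanishes, since no neighborhoods overlap. It becomes more negative as mutated sites share more neighbors, which is the qualitative mechanism by which dense connectivity suppresses large avalanches.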



VI. CONCLUSIONS 

In this paper, we have introduced a spatially extended 
Prisoner's dilemma game with co-evolutionary dynamics 
that lead to Nash equilibria as stationary states. We have 
shown that critical avalanche dynamics are characteristic 
for a broad range of these games. The observed intermittent
evolution with sudden avalanches of activity is reminiscent
of self-organized criticality |58], |3£j . Depending on
the payoff matrix, subcritical, critical, and supercritical 
regimes can be observed. Calculating the Nash equilibria 
and introducing a confined branching process, we were 
able to quantitatively explain the critical value of the 
control parameter, i.e. the temptation to defect, and the 
avalanche distributions. Therefore, investigations on the 
spatially extended Prisoner's dilemma, which has become 
a widely used toy model for the emergence of cooperation,
have to take into account the stability of possible
equilibria, which depends on the chosen payoff matrix, strategy
space, and topology. Complex behavior should only be
found for subcritical or critical dynamics whereas in the 
supercritical regime small perturbations will totally mix 
up the whole system, preventing the evolution of local
structures. The results on the stability of the Nash equilibria
and their connection to evolutionarily stable states
indicate that the concept of equilibrium, originating from 
classical mechanics and brought into the fields of game 
theory and evolution jfO|, has to be further specified to 
take into account co-evolution on networks and other spa- 
tial structures. 